313 research outputs found

    The Tree Inclusion Problem: In Linear Space and Faster

    Given two rooted, ordered, and labeled trees $P$ and $T$, the tree inclusion problem is to determine if $P$ can be obtained from $T$ by deleting nodes in $T$. This problem has recently been recognized as an important query primitive in XML databases. Kilpeläinen and Mannila [SIAM J. Comput. 1995] presented the first polynomial time algorithm using quadratic time and space. Since then several improved results have been obtained for special cases when $P$ and $T$ have a small number of leaves or small depth. However, in the worst case these algorithms still use quadratic time and space. Let $n_S$, $l_S$, and $d_S$ denote the number of nodes, the number of leaves, and the maximum depth of a tree $S \in \{P, T\}$. In this paper we show that the tree inclusion problem can be solved in space $O(n_T)$ and time $O(\min(l_P n_T,\ l_P l_T \log \log n_T + n_T,\ \frac{n_P n_T}{\log n_T} + n_T \log n_T))$. This improves or matches the best known time complexities while using only linear space instead of quadratic. This is particularly important in practical applications, such as XML databases, where the space is likely to be a bottleneck. Comment: Minor updates from last tim
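
To make the problem statement concrete, here is a minimal brute-force sketch of ordered tree inclusion. It is not the linear-space algorithm of the paper; it only follows the definition by considering, for the rightmost root of the target forest, whether that node is deleted or kept. The tuple representation `(label, children)` and the names `forest_included` and `tree_included` are assumptions made for this illustration.

```python
from functools import lru_cache

# A tree is (label, children) where children is a tuple of trees.
# A forest is a tuple of trees, ordered left to right.

@lru_cache(maxsize=None)
def forest_included(p, t):
    """True if forest p can be obtained from forest t by deleting nodes."""
    if not p:
        return True          # everything left in t can be deleted
    if not t:
        return False         # pattern nodes remain but nothing to match
    t_label, t_children = t[-1]
    t_rest = t[:-1]
    # Case 1: delete the rightmost root of t; its children take its place.
    if forest_included(p, t_rest + t_children):
        return True
    # Case 2: keep it; it must then match the rightmost root of p.
    p_label, p_children = p[-1]
    return (p_label == t_label
            and forest_included(p_children, t_children)
            and forest_included(p[:-1], t_rest))

def tree_included(p, t):
    return forest_included((p,), (t,))

# Pattern a(b, c) is included in a(x(b), c): delete x.
P = ("a", (("b", ()), ("c", ())))
T = ("a", (("x", (("b", ()),)), ("c", ())))
print(tree_included(P, T))
```

Memoization keeps the sketch usable on small inputs, but its time and space are far from the bounds stated in the abstract.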

    Hardness of Exact Distance Queries in Sparse Graphs Through Hub Labeling

    A distance labeling scheme is an assignment of bit-labels to the vertices of an undirected, unweighted graph such that the distance between any pair of vertices can be decoded solely from their labels. An important class of distance labeling schemes is that of hub labelings, where a node $v \in G$ stores its distance to the so-called hubs $S_v \subseteq V$, chosen so that for any $u,v \in V$ there is $w \in S_u \cap S_v$ belonging to some shortest $uv$ path. Notice that for most graph classes, the best known distance labeling constructions use a hub labeling scheme as a key building block at some point. Our interest lies in hub labelings of sparse graphs, i.e., those with $|E(G)| = O(n)$, for which we show a lower bound of $\frac{n}{2^{O(\sqrt{\log n})}}$ for the average size of the hubsets. Additionally, we show a hub-labeling construction for sparse graphs of average size $O(\frac{n}{RS(n)^{c}})$ for some $0 < c < 1$, where $RS(n)$ is the so-called Ruzsa-Szemerédi function, linked to the structure of induced matchings in dense graphs. This implies that further improving the lower bound on hub labeling size to $\frac{n}{2^{(\log n)^{o(1)}}}$ would require a breakthrough in the study of lower bounds on $RS(n)$, which have resisted substantial improvement in the last 70 years. For general distance labeling of sparse graphs, we show a lower bound of $\frac{1}{2^{O(\sqrt{\log n})}} SumIndex(n)$, where $SumIndex(n)$ is the communication complexity of the Sum-Index problem over $Z_n$. Our results suggest that the best achievable hub-label size and distance-label size in sparse graphs may be $\Theta(\frac{n}{2^{(\log n)^c}})$ for some $0 < c < 1$
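
The decoding step of a hub labeling can be sketched as follows: the distance between $u$ and $v$ is the minimum of $d(u,w) + d(w,v)$ over common hubs $w \in S_u \cap S_v$. The dictionary representation, the function name `hub_distance`, and the hand-picked hub sets for a 4-vertex path are assumptions for illustration only, not a construction from the paper.

```python
def hub_distance(label_u, label_v):
    """Decode a distance from two hub labels.

    Each label maps a hub w to the distance from the labeled vertex to w.
    Returns None if the labels share no hub.
    """
    common = label_u.keys() & label_v.keys()
    if not common:
        return None
    return min(label_u[w] + label_v[w] for w in common)

# Hypothetical hub sets for the path 0 - 1 - 2 - 3 (unit edge lengths).
# Every pair of vertices shares a hub lying on a shortest path between them.
labels = {
    0: {0: 0, 1: 1},
    1: {1: 0},
    2: {1: 1, 2: 0},
    3: {1: 2, 2: 1, 3: 0},
}

print(hub_distance(labels[0], labels[3]))
print(hub_distance(labels[2], labels[3]))
```

The label size is governed by the hub set sizes $|S_v|$, which is the quantity the bounds in the abstract are about.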

    Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition

    We provide efficient constant factor approximation algorithms for the problems of finding a hierarchical clustering of a point set in any metric space, minimizing the sum of minimum spanning tree lengths within each cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can also be used to provide a pants decomposition, that is, a set of disjoint simple closed curves partitioning the plane minus the input points into subsets with exactly three boundary components, with approximately minimum total length. In the Euclidean case, these curves are squares; in the hyperbolic case, they combine our Euclidean square pants decomposition with our tree clustering method for general metric spaces. Comment: 22 pages, 14 figures. This version replaces the proof of what is now Lemma 5.2, as the previous proof was erroneou

    A simple and optimal ancestry labeling scheme for trees

    We present a $\lg n + 2 \lg \lg n + 3$ ancestry labeling scheme for trees. The problem was first presented by Kannan et al. [STOC '88] along with a simple $2 \lg n$ solution. Motivated by applications to XML files, the label size was improved incrementally over the course of more than 20 years by a series of papers. The last, due to Fraigniaud and Korman [STOC '10], presented an asymptotically optimal $\lg n + 4 \lg \lg n + O(1)$ labeling scheme using non-trivial tree-decomposition techniques. By providing a framework generalizing interval based labeling schemes, we obtain a simple, yet asymptotically optimal solution to the problem. Furthermore, our labeling scheme is attained by a small modification of the original $2 \lg n$ solution. Comment: 12 pages, 1 figure. To appear at ICALP'1
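
As an illustration of an interval based ancestry label, the sketch below assigns each node the pair (its DFS number, the largest DFS number in its subtree). Two numbers below $n$ cost about $2 \lg n$ bits, which is assumed here to correspond to the simple solution mentioned in the abstract; the improved $\lg n + 2 \lg \lg n + 3$ scheme is not shown. The names `interval_labels` and `is_ancestor` and the child-list representation are assumptions for this example.

```python
def interval_labels(children, root):
    """Label each node with (dfs_number, max dfs_number in its subtree)."""
    labels = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in children.get(node, ()):
            visit(child)
        labels[node] = (start, counter - 1)

    visit(root)
    return labels

def is_ancestor(label_u, label_v):
    """Decide ancestry from the two labels alone (a node counts as its own ancestor)."""
    return label_u[0] <= label_v[0] <= label_u[1]

# Small example tree: r has children a and b; a has child c.
children = {"r": ["a", "b"], "a": ["c"]}
labels = interval_labels(children, "r")
print(is_ancestor(labels["r"], labels["c"]))
print(is_ancestor(labels["a"], labels["b"]))
```

The recursive traversal is fine for small trees; a deep tree would need an explicit stack.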

    2-Vertex Connectivity in Directed Graphs

    We complement our study of 2-connectivity in directed graphs by considering the computation of the following 2-vertex-connectivity relations: We say that two vertices v and w are 2-vertex-connected if there are two internally vertex-disjoint paths from v to w and two internally vertex-disjoint paths from w to v. We also say that v and w are vertex-resilient if the removal of any vertex different from v and w leaves v and w in the same strongly connected component. We show how to compute the above relations in linear time so that we can report in constant time if two vertices are 2-vertex-connected or if they are vertex-resilient. We also show how to compute in linear time a sparse certificate for these relations, i.e., a subgraph of the input graph that has O(n) edges and maintains the same 2-vertex-connectivity and vertex-resilience relations as the input graph, where n is the number of vertices. Comment: arXiv admin note: substantial text overlap with arXiv:1407.304
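
The vertex-resilience definition can be checked directly, if slowly, by removing each other vertex in turn and testing reachability in both directions. The sketch below does exactly that and is only a brute-force reading of the definition, not the linear-time method of the paper. The adjacency-dictionary input and the names `reachable` and `vertex_resilient` are assumptions; the extra check with no vertex removed is added here so that the two-vertex case is handled sensibly.

```python
def reachable(adj, source, removed):
    """Vertices reachable from source when `removed` is deleted from the graph."""
    seen = {source}
    stack = [source]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y != removed and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def vertex_resilient(adj, v, w):
    """Brute-force test: v and w stay mutually reachable after any single removal."""
    vertices = set(adj) | {y for ys in adj.values() for y in ys}
    # None stands for "remove nothing" (an added sanity check, see lead-in).
    for removed in [None, *sorted(vertices - {v, w})]:
        if w not in reachable(adj, v, removed):
            return False
        if v not in reachable(adj, w, removed):
            return False
    return True

cycle = {0: [1], 1: [2], 2: [0]}                   # directed 3-cycle
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}       # all edges in both directions
print(vertex_resilient(cycle, 0, 1))
print(vertex_resilient(triangle, 0, 1))
```

Each query costs one graph search per removed vertex, so this is useful only for checking small examples against a faster implementation.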

    Compressed Subsequence Matching and Packed Tree Coloring

    We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses $O(n+\frac{n\sigma}{w})$ space and $O(n+\frac{n\sigma}{w}+m\log N\log w\cdot occ)$ or $O(n+\frac{n\sigma}{w}\log w+m\log N\cdot occ)$ time. Here $w$ is the word size and $occ$ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for $occ=o(\frac{n}{\log N})$ occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings. Comment: To appear at CPM '1
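
The role of the next-occurrence query can be seen on an ordinary, uncompressed string: a pattern is matched as a subsequence by repeatedly jumping to the next occurrence of its next character. The sketch below precomputes those jumps in a plain table; it does not work on grammars and is not the paper's data structure. The names `next_occurrence_table` and `match_end` are assumptions for this illustration.

```python
def next_occurrence_table(text):
    """nxt[i][c] = smallest j >= i with text[j] == c (absent if none)."""
    n = len(text)
    nxt = [dict() for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt[i] = dict(nxt[i + 1])   # inherit occurrences to the right
        nxt[i][text[i]] = i         # position i itself is the nearest one
    return nxt

def match_end(nxt, pattern, start=0):
    """Greedily match pattern as a subsequence from `start`.

    Returns the position just past the last matched character,
    or None if the pattern does not occur as a subsequence.
    """
    pos = start
    for c in pattern:
        j = nxt[pos].get(c)
        if j is None:
            return None
        pos = j + 1
    return pos

nxt = next_occurrence_table("abracadabra")
print(match_end(nxt, "acd"))
print(match_end(nxt, "dz"))
```

The table takes space proportional to the text length times the alphabet size, which is the kind of cost the compressed structure in the abstract is designed to avoid.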

    Labeling Schemes for Bounded Degree Graphs

    We investigate adjacency labeling schemes for graphs of bounded degree $\Delta = O(1)$. In particular, we present an optimal (up to an additive constant) $\log n + O(1)$ adjacency labeling scheme for bounded degree trees. The latter scheme is derived from a labeling scheme for bounded degree outerplanar graphs. Our results complement a similar bound recently obtained for bounded depth trees [Fraigniaud and Korman, SODA 10], and may provide new insights for closing the long standing gap for adjacency in trees [Alstrup and Rauhe, FOCS 02]. We also provide improved labeling schemes for bounded degree planar graphs. Finally, we use combinatorial number systems and present an improved adjacency labeling scheme for graphs of bounded degree $\Delta$ with $(e+1)\sqrt{n} < \Delta \leq n/5$

    Tree Compression with Top Trees Revisited

    We revisit tree compression with top trees (Bille et al., ICALP'13) and present several improvements to the compressor and its analysis. By significantly reducing the amount of information stored and guiding the compression step using a RePair-inspired heuristic, we obtain a fast compressor achieving good compression ratios, addressing an open problem posed by Bille et al. We show how, with relatively small overhead, the compressed file can be converted into an in-memory representation that supports basic navigation operations in worst-case logarithmic time without decompression. We also show a much improved worst-case bound on the size of the output of top-tree compression (answering an open question posed in a talk on this algorithm by Weimann in 2012). Comment: SEA 201

    A simpler and more efficient algorithm for the next-to-shortest path problem

    Given an undirected graph $G=(V,E)$ with positive edge lengths and two vertices $s$ and $t$, the next-to-shortest path problem is to find an $st$-path whose length is minimum amongst all $st$-paths strictly longer than the shortest path length. In this paper we show that the problem can be solved in linear time if the distances from $s$ and $t$ to all other vertices are given. In particular, our new algorithm runs in $O(|V|\log |V|+|E|)$ time for general graphs, which improves the previous result of $O(|V|^2)$ time for sparse graphs, and takes only linear time for unweighted graphs, planar graphs, and graphs with positive integer edge lengths. Comment: Partial result appeared in COCOA201

    Dynamic and Multi-functional Labeling Schemes

    We investigate labeling schemes supporting adjacency, ancestry, sibling, and connectivity queries in forests. In the course of more than 20 years, the existence of $\log n + O(\log \log n)$ labeling schemes supporting each of these functions was proven, with the most recent being ancestry [Fraigniaud and Korman, STOC '10]. Several multi-functional labeling schemes also enjoy lower or upper bounds of $\log n + \Omega(\log \log n)$ or $\log n + O(\log \log n)$ respectively. Notable examples are an upper bound of $\log n + 5\log \log n$ for adjacency+siblings and a lower bound of $\log n + \log \log n$ for each of the functions siblings, ancestry, and connectivity [Alstrup et al., SODA '03]. We improve the constants hidden in the $O$-notation. In particular we show a $\log n + 2\log \log n$ lower bound for connectivity+ancestry and connectivity+siblings, as well as an upper bound of $\log n + 3\log \log n + O(\log \log \log n)$ for connectivity+adjacency+siblings by altering existing methods. In the context of dynamic labeling schemes it is known that ancestry requires $\Omega(n)$ bits [Cohen et al., PODS '02]. In contrast, we show upper and lower bounds on the label size for adjacency, siblings, and connectivity of $2\log n$ bits, and $3 \log n$ to support all three functions. There exist efficient adjacency labeling schemes for planar, bounded treewidth, bounded arboricity and interval graphs. In a dynamic setting, we show a lower bound of $\Omega(n)$ for each of those families. Comment: 17 pages, 5 figure